Introduction to Generative Modeling: Moving Beyond Discrimination
We are transitioning from discriminative modeling, which addresses classification and regression by learning the conditional probability $P(y|x)$, to generative modeling. Our core objective shifts to density estimation: learning the underlying data distribution $P(x)$ itself. This shift lets us capture the dependencies and structure of high-dimensional data, moving beyond boundary separation toward understanding and synthesizing the data itself.
1. The Generative Objective: Modeling $P(x)$
The goal of a generative model is to estimate the probability distribution $P(x)$ from which the training data $X$ originated. A successful generative model can perform three crucial tasks: (1) Density Estimation (assigning a probability score to an input $x$), (2) Sampling (generating entirely new data points $x_{new} \sim P(x)$), and (3) Unsupervised Feature Learning (discovering meaningful, disentangled representations in a latent space).
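To make the three tasks concrete, here is a minimal sketch that uses a Gaussian mixture as a stand-in for $P(x)$. The choice of `GaussianMixture`, the synthetic data, and the treatment of component responsibilities as a crude latent code are illustrative assumptions, not part of the original setup; the same interface ideas carry over to VAEs, flows, and other density models.

```python
# A minimal sketch of the three generative tasks with a simple density model.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X_train = rng.normal(loc=0.0, scale=1.0, size=(1000, 2))  # toy training data

model = GaussianMixture(n_components=3, random_state=0).fit(X_train)

# (1) Density estimation: score a query point under the learned P(x)
x_query = np.array([[0.1, -0.2]])
log_px = model.score_samples(x_query)      # log P(x_query)

# (2) Sampling: draw entirely new points x_new ~ P(x)
x_new, _ = model.sample(n_samples=5)

# (3) Feature learning: soft component memberships act as a crude latent code
latent = model.predict_proba(x_query)
print(log_px, x_new.shape, latent)
```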
2. Taxonomy: Explicit vs. Implicit Likelihood
Generative models are fundamentally categorized by their approach to the likelihood function. Explicit Density Models, such as Variational Autoencoders (VAEs) and Flow Models, define a mathematical likelihood function and attempt to maximize it (or its lower bound). Implicit Density Models, most famously Generative Adversarial Networks (GANs), bypass the likelihood calculation entirely, learning instead a mapping function to sample from the distribution $P(x)$ using an adversarial training framework.
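The contrast in training signals can be sketched side by side. The snippet below is a simplification under assumed placeholders (a diagonal Gaussian standing in for a VAE decoder or flow, and two tiny networks standing in for a GAN generator and discriminator); it is not a full implementation of either model family.

```python
# Explicit vs. implicit training signals, sketched in PyTorch.
import torch
import torch.nn as nn

x = torch.randn(64, 2)                        # a batch of real data

# Explicit density: minimize the negative log-likelihood directly.
# A diagonal Gaussian with learnable mean/log-variance stands in for P(x).
mu = nn.Parameter(torch.zeros(2))
log_var = nn.Parameter(torch.zeros(2))
nll = 0.5 * (((x - mu) ** 2) / log_var.exp() + log_var).sum(dim=1).mean()
# an optimizer step on nll maximizes log P(x)

# Implicit density: never evaluate P(x); train a generator to fool a critic.
G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
D = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1))
z = torch.randn(64, 8)
fake = G(z)
bce = nn.BCEWithLogitsLoss()
d_loss = bce(D(x), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
g_loss = bce(D(fake), torch.ones(64, 1))      # generator tries to look "real"
```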
3. Application: Anomaly Detection via Density Estimation
Goal: determine whether a new transaction $x_{new}$ is an anomaly (fraud), given a model trained only on normal transactions.
The model must evaluate the probability (or likelihood) $P(x_{new})$. If $P(x_{new})$ falls below a predefined threshold $\tau$, meaning the new point is statistically improbable under the learned distribution of normal transactions, it is flagged as an anomaly.
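A minimal sketch of this thresholding rule follows. The `GaussianMixture` density model, the feature dimensionality, and the choice of $\tau$ as the 1st percentile of training log-likelihoods are illustrative assumptions; the code works in log-likelihoods, which is equivalent to thresholding $P(x_{new})$ with a log-transformed $\tau$.

```python
# Likelihood-based fraud detection: flag points that are improbable under P(x).
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
normal_txns = rng.normal(size=(5000, 4))        # features of normal transactions

model = GaussianMixture(n_components=5, random_state=0).fit(normal_txns)

# Pick tau from the training data, e.g. the 1st percentile of log-likelihoods
train_ll = model.score_samples(normal_txns)
tau = np.percentile(train_ll, 1)

def is_anomaly(x_new: np.ndarray) -> bool:
    """Flag x_new as fraud if its log-likelihood falls below tau."""
    return model.score_samples(x_new.reshape(1, -1))[0] < tau

print(is_anomaly(rng.normal(size=4)))           # typical point: likely not flagged
print(is_anomaly(np.full(4, 10.0)))             # far from normal data: likely flagged
```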